---
title: Spark Core
description: Apache Spark™ is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.
date: 2025-07-23 11:21
tags: ["Big Data", "Spark"]
published: false
status: growing
---

# Spark

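# RDD: Resilient Distributed Dataset
An RDD (Resilient Distributed Dataset) is the most basic data abstraction in Spark. It represents an immutable, partitioned collection whose elements can be computed on in parallel.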
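
To make the three properties concrete (immutable, partitioned, computed in parallel), here is a minimal single-process sketch in plain Python. It is a toy model, not Spark's API: the `ToyRDD` class and its methods are hypothetical names chosen for illustration, and it evaluates eagerly with threads, whereas a real RDD is evaluated lazily across a cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Generic, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
U = TypeVar("U")


class ToyRDD(Generic[T]):
    """Hypothetical stand-in for an RDD, for illustration only (not Spark's API)."""

    def __init__(self, partitions: Iterable[Iterable[T]]) -> None:
        # Immutable: partitions are stored as tuples and never modified in place.
        self._partitions: Tuple[Tuple[T, ...], ...] = tuple(tuple(p) for p in partitions)

    @classmethod
    def parallelize(cls, data: Iterable[T], num_partitions: int) -> "ToyRDD[T]":
        # Partitioned: split the input into contiguous slices.
        items = list(data)
        n = len(items)
        bounds = [i * n // num_partitions for i in range(num_partitions + 1)]
        return cls(items[bounds[i]:bounds[i + 1]] for i in range(num_partitions))

    @property
    def num_partitions(self) -> int:
        return len(self._partitions)

    def map(self, fn: Callable[[T], U]) -> "ToyRDD[U]":
        # Parallel: each partition is processed independently by a worker thread.
        # A transformation returns a new ToyRDD; the original is left untouched.
        with ThreadPoolExecutor() as pool:
            new_parts = list(pool.map(lambda part: tuple(fn(x) for x in part), self._partitions))
        return ToyRDD(new_parts)

    def collect(self) -> List[T]:
        # Gather all partitions back into one list, preserving partition order.
        return [x for part in self._partitions for x in part]


rdd = ToyRDD.parallelize(range(1, 7), num_partitions=3)
squared = rdd.map(lambda x: x * x)

print(rdd.num_partitions)   # 3
print(rdd.collect())        # [1, 2, 3, 4, 5, 6]
print(squared.collect())    # [1, 4, 9, 16, 25, 36]
```

Note that `map` does not change `rdd`; it produces a new collection with the same partitioning, which mirrors how transformations on an RDD yield a new RDD rather than mutating the existing one.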

## RDD Persistence
